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Abstract 

In this paper we introduce, for the first time, the Weibull-Geometric distribution 
which generalizes the exponential-geometric distribution proposed by Adamidis and 
Loukas (1998). The hazard function of the last distribution is monotone decreasing 
but the hazard function of the new distribution can take more general forms. Unlike 
the Weibull distribution, the proposed distribution is useful for modeling unimodal 
failure rates. We derive the cumulative distribution and hazard functions, the den- 
sity of the order statistics and calculate expressions for its moments and for the 
moments of the order statistics. We give expressions for the Renyi and Shannon 
entropies. The maximum likelihood estimation procedure is discussed and an algo- 
rithm EM (Dempster et al., 1977; McLachlan and Krishnan, 1997) is provided for 
estimating the parameters. We obtain the information matrix and discuss inference. 
Applications to real data sets are given to show the flexibility and potentiality of 
the proposed distribution. 

keywords: EM algorithm; Exponential distribution; Geometric distribution; 
Hazard function; Information matrix; Maximum likelihood estimation; Weibull dis- 
tribution 

1 Introduction 

Several distributions have been proposed in the literature to model lifetime data. Adamidis 
and Loukas (1998) introduced the two-parameter exponential-geometric (EG) distribu- 
tion with decreasing failure rate. Kus (2007) introduced the exponential-Poisson distri- 
bution (following the same idea of the EG distribution) with decreasing failure rate and 
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discussed various of its properties. Marshall and Olkin (1997) presented a method for 
adding a parameter to a family of distributions with application to the exponential and 
Weibull families. Adamidis et al. (2005) proposed the extended exponential-geometric 
(EEG) distribution which generalizes the EG distribution and discussed various of its 
statistical properties along with its reliability features. The hazard function of the EEG 
distribution can be monotone decreasing, increasing or constant. 

The Weibull distribution is one of the most commonly used lifetime distribution in 
modeling lifetime data. In practice, it has been shown to be very flexible in modeling 
various types of lifetime distributions with monotone failure rates but it is not useful 
for modeling the bathtub shaped and the unimodal failure rates which are common in 
reliability and biological studies. In this paper we introduce a WeibuU-geometric (WG) 
distribution which generalizes the EG and Weibull distributions and study some of its 
properties. The paper is organized as follows. In Section 2, we define the WG distribution 
and plot its probability density function (pdf). In Section 3, we give some properties of 
the new distribution. We obtain the cumulative distribution function (cdf), survivor 
and hazard functions and the pdf of the order statistics. We also give expressions for 
its moments and for the moments of the order statistics. The estimation by maximum 
likelihood using the algorithm EM is studied in Section 4 and inference is discussed in 
Section 5. Illustrative examples based on real data are given in Section 6. Finally, Section 
7 concludes the paper. 

2 The WG distribution 

The EG distribution (Adamidis and Loukas, 1998) can be obtained by compounding an 
exponential with a geometric distribution. In fact, if X follows an exponential distribution 
with parameter f3Z, where Z is a geometric variable with parameter p, then X has the 
EG distribution with parameters {P,p)- Since the Weibull distribution generalizes the 
exponential distribution, it is natural to extend the EG distribution by replacing in the 
above compounding mechanism the exponential by the Weibull distribution. 

Suppose that {Yj}f^^ are independent and identically distributed (iid) random vari- 
ables following the Weibull distribution W{P, a) with scale parameter /3 > 0, shape 
parameter a > and pdf 

g{x; f3, a) = a/?"x°-^e-(^^)'' , x > 0, 

and a discrete random variable having a geometric distribution with probability func- 
tion P{n;p) = (1 — p)p"~^ for n eN and p G (0, 1). Let X = minlyj}^^. The marginal 
pdf of X is 

/(x;p,/?,a) = a/?'"(l -p)x°-^e-(^^)"{l -pe-(^^)"}-2, x > 0, (1) 

which defines the WG distribution. It is evident that ([T]) is much more flexible than 
the Weibull distribution. The EG distribution is a special case of the WG distribution 
for a = 1. When p approaches zero, the WG distribution leads to the Weibull W{P, a) 
distribution. Figure [H plots the WG density for some values of the vector = (/?, a) when 
p = 0.01, 0.2, 0.5, 0.9. For all values of parameters, the density tends to zero as x — oo. 
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Figure 1: Pdf of the WG distribution for selected values of the parameters. 

For a > 1, the WG density is unimodal (see appendix A) and the mode — (5~'^u^J°' 
is obtained by solving the nonlinear equation 

\ a J a 

The pdf of the WG distribution can be expressed as an infinite mixture of WeibuU dis- 
tributions with the same shape parameter a. If < 1 and A; > 0, we have the series 
representation 
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Expanding {1 —pe ('^^)"} ^ as in ([3]), we can write (P) as 

oo 
j=0 

From the WeibuU pdf given before, we have 

oo 

f{x;p, (3, a) = (1 - p) Y.p^g{x; (3{j + 1)^/", a). (4) 

Hence, some mathematical properties (cdf, moments, percentiles, moment generating 
function, factorial moments, etc.) of the WG distribution can be obtained using (jll) from 
the corresponding properties of the WeibuU distribution. 



3 Properties of the WG distribution 

3.1 The distribution and hazard functions and order statistics 

Let X be a random variable such that X follows the WG distribution with parameters 
p, P and a. In the sequel, the distribution of X will be referred to WG{p, P, a). The cdf 
is given by 

The survivor and hazard functions are 

S{x) = \ ^J' , x>0 (6) 



and 



respectively. 



= -pe-(^")"}-\ x>0, (7) 



The hazard function ([7j) is decreasing for < a < 1. However, for a > 1 it can 
take different forms. As the WG distribution converges to the Weibull distribution when 
p — ^ O"*", the hazard function for very small values of p can be decreasing, increasing and 
almost constant. When p — >■ 1~, the WG distribution converges to a distribution degen- 
erate in zero. Hence, the parameter p can be interpreted as a concentration parameter. 
Figure [2] illustrates some of the possible shapes of the hazard function for selected values 
of the vector = [jS, a) when p = 0.01, 0.2, 0.5 and 0.9. These plots show that the hazard 
function of the new distribution is quite flexible. 
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Figure 2: Hazard rate function of the WG distribution for selected values of the param- 
eters. 

We now calculate the pdf of the order statistics. Let Xi, . . . , X„ be random variables 
iid such that Xj ~ WG{p, /5, a) for i = 1, . . . , n. The pdf of the ith order statistic, Xj.„ 
say, is given by (for x > 0) 

J-'r^y"")- B{i,n^i + l) {i_pe-('3-)"}«+i' 

where B{a,b) = uj"-'^^{l — uj)^'^duj denotes the beta function. Let gi:n{x) be the pdf 
of the ith WeibuU order statistic with parameters /3 and a given by 

^'■"^ ' B{i,n-i + l) ^ ^ 
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Equation ^ can be rewritten in terms of gi:n{x) as 

Further, we can express the pdf of Xi.n as a mixture of WeibuU order statistic densities. 
Using (I3]) in ([8]), we obtain 

Hence, using ([9]), some mathematical properties for the order statistics of the WG dis- 
tribution can be immediately obtained from the corresponding properties of the Weibull 
order statistics. 

3.2 Quantiles and moments 

The quantile 7 (x^) of the WG distribution follows from ([5]) as 

1 -P7~ 



_ l/a 

= 13^^ { loj 
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In particular, the median is simply 0:0.5 = /5~^{log(l — p)}^/". 
The rth moment of X is given by 
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Expanding the term {1—pe (^^■)"} ^ as in ([3]) yields 

(i-p)r|r^ 

where L{p;a) = Yl'jLiP'^ j'"" is Euler's polylogarithm function (see, Erdelyi et al., 1953, 
p. 31) which is readily available in standard software such as Mathematica. 

Figure [3] plots the skewness and kurtosis of the WG distribution as functions of p for 
(3 = 1 and some values of a. When p — > 1~, the coefficients of skewness and kurtosis tend 
to zero as expected, since the WG distribution converges to a degenerate distribution (in 
zero) when p ^ 1^ . 

The rth moment of the ith order statistic is given by 

p(yr s _ Q;/?"(1 -p)" f°° o,+r-l -{n-i+l){l3xr " g"^^""^ Y'^ ■ 

^K^i:n}- B{i,n-i + l) J, {i_pe-(/3-)'^}-+i 

Expanding the term {1 — pe"'-^^^"}^'-"^^'' as in and using the binomial expansion for 
{1 — e~^^^)"}*~^, the rth moment of Xj:„ becomes 

P,^. ^ (1 -p)-^^r(r/a + 1) f.^ {-irrj)^.')!^ 
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Figure 3: Skewness and kurtosis of the WG distribution as functions of p for (3 = 1 and 
some values of a. 

We now give an alternative expression to (|TU|) by using a result due to Barakat and 
Abdelkader (2004). We have 

k=n-i+l ^ / \ J Jo 

where S{x) is the survivor function ([6]). 

Using the expansion ([3D and changing variables u = {k + j)(/5x)°, we have 

rx^-'S{xfdx = {l-p)'f](''t^~^)pi Tx^-'e-^'+^^^^^^'^dx 
Jo i=o ^ k - 1 J Jq 



Hence, 



^ r(r/a + 1) ^ ,A . .kfn\(k-l\fk + j-i y{l-py 

^ i-l)n-^+lf3r2-.^Z^^^^^ ) {k J {u - ^ J { k - 1 J {k + jY/^ ' 

(11) 



Expressions ( fTOl) and ( ITTil give the moments of the order statistics and can be compared 
numerically. Table [1] gives numerical values for the first four moments of the order 
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statistics Xi:i5, Xr^s and Xi5:i5 from (ITOl) and ( fTTl) with the index j stopping at 100 and 
by numerical integration. We take the parameter values as p = 0.8, (3 = 0.4 and a = 2. 
The figures in this table show good agreement among the three methods. 



Xi:i5 i rth moment— > 


r=l r=2 r=3 r=4 


F/Xnrpssion ffTOl) 
i = 1 Expression (ITTl) 
Numerical 


25717 08697 035116 016265 
0.25717 0.08697 0.035116 0.016265 
0.26102 0.08795 0.035408 0.016364 


ExDression fllOl) 
i = 7 Expression (ITTl) 
Numerical 


96660 98827 1 06643 1 21249 
0.98502 0.99295 1.06784 1.21298 
0.96674 0.98836 1.06649 1.21253 


Expression (llOp 
i = 15 Expression (|TT]) 
Numerical 


3.33109 11.97872 46.35371 192.32090 
3.33126 11.97875 46.35375 192.32090 
3.33126 11.97875 46.35375 192.32090 



Table 1: First four moments of some order statistics from ( ITOi) and ( fTTl) and by numerical 
integration. 



3.3 Renyi and Shannon entropies 

Entropy has been used in various situations in science and engineering. The entropy of a 
random variable X is a measure of variation of the uncertainty. Renyi entropy is defined 
by /i?(7) = l°s{/]R P{.x)dx}, where 7 > and 7 7^ 1. By using ([3]), we have 

If(a — 1)(7 — 1)>0, this expression reduces to 

rK a \A r(a)[a(l -p)]^ ^ .r(27 + j) 
J V ; /3°(i-7)r(27) ^ 

where Yj follows a gamma distribution with scale parameter (7 + j)^^" and shape param- 
eter a. Hence, we obtain 

T i \ = ^\ ^ f - p)]"r(7(« - 1) + 1) p^r(27 + i) 

1-7 ^1 /5^-^r(27) ^ + j){"-i)(7-i)/«+i 

Shannon entropy is defined as log[/(X)]}. This is a special case obtained from 
lim^^i /ij(7). Then, 

E\- log /(X)] = - \o^\ari\ -V)\ - («- l)i? [log(X)] +/3"E(X") -2i?{log[l -pe-(^^)"]}. 
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We can show that 

E[log{X)] = ij{l)/a, 

= -^log(l-p), 

£;{log[l-pe-(^^)n} = -^{l + (l-p)[l + log(l-p)]}. 
Hence, the Shannon entropy reduces to 

^[-log/(X)] = -log[a/3"(l -p)] - ^^(1) - - 2p+ (3 - 2p) log(l - p)]. 

4 Estimation 

Let random sample of the WG distribution with unknown parameter 

vector 9 — (p, (3, a). The log hkelihood £ = l{9] x) for 9 is 

n n 

^ = n[logQ; + Q;log/? + log(l-p)] + (a-l)5^1og(xi)-^(/9a;i)" 

1=1 i=i 

n 

- 2^1og[l-pe-(^-^)"]- 

i=l 

The score function U{9) — {dl/dp,dl/ dl3,dl/daY has components 

1=1 

1^ = nQ;/3-i-a/3-i J^<{l + 2pe-(''-^)"[l-pe-(^=^^)"]"'}, 

i=l 

(9/ n n 

_ = na-^ + J]log(/3a;,) - log(/3a;,){l + 2p e-^^^^^" [I - p e^^^^^^"]-^} . 

i=l i=l 

The maximum likelihood estimate (MLE) of is calculated numerically from the 
nonlinear equations U{9) = 0. We use the EM algorithm (Dempster et al., 1977; McLach- 
lan and Krishnan, 1997) to obtain 9. For doing this, we define an hypothetical complete- 
data distribution with density function 

fix, z; 9) = a/?°(l - p)2x"-y-^e-^(^^)", 

for X, /3, q; > 0, p e (0,1) and 2; e N. Under this formulation, the E-step of an EM 
cycle requires the conditional expectation of {Z\X;9'^''^). where 9'^'^^ = {p^^\(3^^\ aW) is 
the current estimate of 9. Using that P{z\x]9) = zp^^^e^^^^^^^^^^^" x {1 — pe^^f^^^"}"^ for 
z e N,it follows E{Z\X;9) = {1 +pe-(^^)"}{l -pe-^'^"^)"}^^ The EM cycle is completed 
with the M-step by using the maximum likelihood estimation over 9, with the missing Z's 
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replaced by their conditional expectations given above. Hence, an EM iteration reduces 

to ^^^^^ 

Ei=i I i=i 

where q;('"+^) is the solution of the nonlinear equation 



n 



n sr^n (r) 



. 1 Wi xf log Xi 



i=l Z^i=l -^i 

where 

w- = 

1 _ pMe-C/sWa^O"^"^' ■ 

An implementation of this algorithm using the software R is given in Appendix B. 

5 Inference 

For interval estimation and hypothesis tests on the model parameters, we require the 
information matrix. The 3x3 observed information matrix J„ = J„(^) is given by 





^ Jpp 


Jp/3 


Jpa \ 


Jn 






Jaa 




\ Jpa 


J 13a 


Jaa J 



where 



Jpp — a„2 ~ ^ -^0,0,2,2 ^(1 P) 



dp 

1. 

dpda 

dH 
dpd/3 

dH 



2p " 



i=l 

-Jaa = ^^-nor' + jyp'f^'^T^^^ + 2p(^-T!il^^-p-T^^^- 



i=l 

Darp{i) 



dH 

dadp 

2paP'--'J2(pT2l2,2 + T^li,i), 



i=l 
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9^ 



i=l 



Here, 

for {j,k,l,m) e {0,1,2} and i — l,...,n. Under conditions that are fulfilled for the 
parameter 9 in the interior of the parameter space but not on the boundary, the asymp- 
totic distribution of ^/n{6 — 6) is multivariate normal iV3(0. where K(9) = 
limn^oon~^Jn{d) is the unit information matrix. This asymptotic behavior remains valid 
if K{6) is replaced by the average sample information matrix evaluated at 6, i.e., n~^Jn{0). 
We can use the asymptotic multivariate normal iV3(0, Jn{d)~^) distribution of 6 to cons- 
truct approximate confidence regions for some parameters and for the hazard and survival 
functions. In fact, an 100(1 — 7)% asymptotic confidence interval for each parameter 6i 
is given by 

where J^'^' represents the diagonal element of JniO)"^ for i = 1,2,3,4 and Zj/2 is 
the quantile 1 — 7/2 of the standard normal distribution. 

The asymptotic normality is also useful for testing goodness of fit of the three param- 
eter WG distribution and for comparing this distribution with some of its special sub- 
models via the likehhood ratio (LR) statistic. We consider the partition 9 = {9f, ^J)^, 
where 9i is a subset of parameters of interest of the WG distribution and ^2 is a subset of 
the remaining parameters. The LR statistic for testing the null hypothesis Hq : 9i = 9^^ 
versus the alternative hypothesis Hi : 9i ^ 9f^ is given hj w — 2{£(^) — £{9)}, where 
9 and 9 denote the MLEs under the null and alternative hypotheses, respectively. The 
statistic w is asymptotically (as n cxo) distributed as x^, where k is the dimension of 
the subset 9i of interest. For example, we can compare the EG model against the WG 
model by testing Hq : a — 1 versus Hi : a 1 and the WeibuU model against the WG 
model by testing Hq : a — l,p — versus Hi : Hq is false. 



6 Applications 

In this section, we fit the WG models to two real data sets. The first data set consist of 
the number of successive failures for the air conditioning system of each member in a fieet 
of 13 Boeing 720 jet airplanes. The pooled data with 214 observations were first analyzed 
by Proschan (1963) and discussed further by Dahiya and Gurland (1972), Gleser (1989), 
Adamidis and Loukas (1998) and Kus (2007). The second data set is an uncensored data 
set from Nichols and Padgett (2006) consisting of 100 observations on breaking stress of 
carbon fibres (in Gba). 
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For the first data set, the estimated parameters using an EM algorithm were p = 
0.7841, P = 0.0048 and a = 1.2246. The fitted pdf and the estimated quantiles versus 
observed quantiles are given in Figures HI This figure shows a good fit of the WG model 
for the first data set. 




1 I I \ I \ I ^ \ I I I I 

100 200 300 400 500 600 100 200 300 400 500 600 



Quantil Empirico 

Figure 4: Plots of the fitted pdf and of the estimated quantiles versus observed quantiles 
for the first data set. 

For the second data set, the estimates obtained using an EM algorithm are p = 0.3073, 
P = 0.3148 and a = 3.0093. The plot of the fitted pdf and the estimated quantiles versus 
observed quantiles in Figure [5] shows a good fit of the WG model. 

7 Conclusions 

We define a new model so called the Weibull-geometric (WG) distribution that generalizes 
the exponential-geometric (EG) distribution introduced by Adamidis and Loukas (1998). 
Some mathematical properties are derived and plots of the pdf and hazard functions are 
presented to show the flexibility of the new distribution. We give closed form expressions 
for the moments of the distribution. We obtain the pdf of the order statistics and provide 
expansions for the moments of the order statistics. Estimation by maximum likelihood 
is discussed and an algorithm EM is proposed. We discuss inference, give asymptotic 
confldence intervals for the model parameters and present the use of the LR statistic to 
compare the flt of the WG model with special sub-models. Finally, we fltted WG models 
to two real data sets to show the flexibility and the potentially of the new distribution. 
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Figure 5: Plots of the fitted pdf and of the estimated quantiles versus observed quantiles 
for the second data set. 

Appendix A 

We now show that the WG density is unimodal when a > 1. Let 

h{u) = u + pe"" + - — . 

When u — > 0"*", h{u) — > and when u oo, h{u) —>■ oo. Thus, if h{u) is an 

increasing function, the solution in ([2]) is unique and the WG distribution is unimodal. 
We have h'{u) = 1 — pe~" (u + +]?e~". Using the inequalities — pe~"^^ > —pe~^ 
and —pe''^u > e— 1, it follows that h'{u) > 1 — > 0, Vm > 0. Hence, h{u) is an 
increasing function and the WG distribution is unimodal if a > 1. 

Appendix B 

The following R function estimates the model parameters p, j3 and a through an EM 
algorithm. 

f it .WG<-f unction (x, par ,tol=le-4,maxi=100){ 

# X Numerical vector of data. 

# par Vector of initial values for the parameters p, beta and alpha 

# to be optimized over, on this exactly order. 

# tol Convergence tolerance. 
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#mcLxi Upper end point of the interval to be searched. 

p<-par [1] 

beta<-par [2] 

alpha<-par [3] 

n<-length(x) 

z . temp<-f unctionO { 

(l+p*exp (- (beta*x) "alpha) ) / ( l-p*exp (- (beta*x) "alpha) ) 
} 

alpha. sc<-f unction (alpha) { 

n/alpha+sum(log(x) )-n*suin(z*x''alpha*log(x) )/suin(z*x'"alpha) 
} 

test<-l 

while (test >tol){ 
z<-z.temp() 

alpha . new<- (alpha . sc , interval=c (0 ,maxi) ) $root 
beta . new<- (n/ sum (x^alpha . new*z) ) " ( 1/alpha . new) 
p . new<-l-n/ sum (z) 

test<-max (abs (c ( ( (alpha . new-alpha) ) , 
( (beta . new-beta) ) , 
((p.new-p))))) 

alpha<-alpha . new 
beta< -beta. new 
p<-p . new 
} 

c(p, beta, alpha) 
} 
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